50 research outputs found

    The ViP2P Platform: XML Views in P2P

    The growing volumes of XML data available on the Web or produced by enterprises and other organizations raise many performance challenges for data management applications. In this work, we are concerned with the distributed, peer-to-peer management of large corpora of XML documents, based on distributed hash table (DHT) overlay networks. We present ViP2P (Views in Peer-to-Peer), a distributed platform for sharing XML documents over a structured P2P network infrastructure (DHT). At the core of ViP2P are distributed materialized XML views, defined by arbitrary XML queries, filled in with data published anywhere in the network, and exploited to efficiently answer queries issued by any network peer. ViP2P allows user queries to be evaluated over XML documents published by peers in two modes. First, in a long-running subscription mode, a query is registered in the system and receives answers incrementally, when and if published data matches it. Second, in an ad-hoc snapshot mode, results are required immediately and must be computed from the results of other long-running subscription queries. ViP2P innovates over similar DHT-based XML sharing platforms by using a very expressive structured XML query language. This expressivity leads to a very flexible distribution of XML content in the ViP2P network, and to efficient snapshot query execution. ViP2P has been tested in real deployments on hundreds of computers. We present the platform architecture and its internal algorithms, and demonstrate its efficiency and scalability through a set of experiments. Our experimental results exceed those of similar competing systems by orders of magnitude in terms of data volumes, network size, and data dissemination throughput.
    Comment: RR-7812 (2011)
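    The two query modes described in this abstract can be illustrated with a minimal single-process sketch. It is not the ViP2P implementation: the class names `View` and `ToyNetwork` are hypothetical, views are defined by simple `xml.etree.ElementTree` path expressions rather than ViP2P's more expressive query language, no DHT is involved, and a snapshot query is answered only when a view with exactly the same definition exists (the real platform combines views through query rewriting).

```python
import xml.etree.ElementTree as ET


class View:
    """A materialized view defined by an ElementTree path expression.

    Assumption: this path is a simplified stand-in for a ViP2P view definition.
    """

    def __init__(self, path):
        self.path = path
        self.tuples = []  # text values extracted from published documents

    def on_publish(self, doc_root):
        # Subscription mode: matches are added incrementally as documents arrive.
        new = [el.text for el in doc_root.findall(self.path)]
        self.tuples.extend(new)
        return new


class ToyNetwork:
    """In-memory stand-in for the peer network; nothing is distributed here."""

    def __init__(self):
        self.views = {}

    def subscribe(self, path):
        # Register a long-running query (or reuse an identical existing one).
        return self.views.setdefault(path, View(path))

    def publish(self, xml_text):
        # Every registered view is offered each newly published document.
        root = ET.fromstring(xml_text)
        for view in self.views.values():
            view.on_publish(root)

    def snapshot(self, path):
        # Snapshot mode: answer immediately, using only data already stored
        # in a view. Returns None when no matching view exists.
        view = self.views.get(path)
        return list(view.tuples) if view is not None else None


net = ToyNetwork()
net.subscribe(".//book/title")
net.publish("<lib><book><title>XML</title></book></lib>")
net.publish("<lib><book><title>P2P</title></book><cd><title>x</title></cd></lib>")
titles = net.snapshot(".//book/title")
```

    Here `titles` holds the book titles collected by the subscription, while a snapshot query for a path with no registered view returns `None`, which reflects the constraint that snapshot answers come only from existing subscription results.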

    OptimAX: optimizing distributed continuous queries

    The system we propose to present, OptimAX, applies the principles of distributed query optimization to the problem of distributed evaluation of continuous XML queries. OptimAX is an optimizer for Active XML (AXML) documents. It is implemented as a module that runs alongside an AXML peer and may be invoked whenever users ask queries on their AXML documents. The optimizer draws up an initial query plan, and then attempts to rewrite it using a combination of heuristics and cost information in order to improve the plan's performance estimates. An interesting feature is that all plans are themselves AXML documents. Once the optimizer has retained a plan, it hands it to the AXML peer, which evaluates it directly, following the decisions taken by the optimizer.
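    The idea of starting from an initial plan and rewriting it with heuristics guided by cost estimates can be sketched as follows. This is an illustration under stated assumptions, not OptimAX itself: plans here are plain Python dataclasses rather than AXML documents, and the `Plan` fields, the `cost` model (amount of data shipped between peers), and the single rule `push_filter_to_source` are all hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Plan:
    source_peer: str    # peer holding the data
    filter_peer: str    # peer where the selection is applied
    query_peer: str     # peer that asked the query
    size: int           # estimated size of the unfiltered data
    selectivity: float  # estimated fraction of the data kept by the selection


def cost(plan):
    """Estimated amount of data shipped between peers (hypothetical model)."""
    shipped = 0.0
    current = plan.size
    if plan.filter_peer != plan.source_peer:
        shipped += current  # unfiltered data travels to the filtering peer
    current *= plan.selectivity
    if plan.filter_peer != plan.query_peer:
        shipped += current  # filtered data travels to the asking peer
    return shipped


def push_filter_to_source(plan):
    """Heuristic rewrite: apply the selection where the data lives."""
    return replace(plan, filter_peer=plan.source_peer)


def optimize(plan, rules):
    """Greedy rewriting: apply rules for as long as one lowers the estimate."""
    best = plan
    improved = True
    while improved:
        improved = False
        for rule in rules:
            candidate = rule(best)
            if cost(candidate) < cost(best):
                best, improved = candidate, True
    return best


initial = Plan("p1", "p2", "p2", size=1000, selectivity=0.1)
final = optimize(initial, [push_filter_to_source])
```

    In this example the initial plan ships all the data to the asking peer before filtering, and the retained plan filters at the source peer, so only the selected fraction is shipped.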

    Viewing a World of Annotations through AnnoVIP

    The development of digital content has led to the emergence of corpora of interconnected structured documents (such as HTML or XML pages) and of semantic annotations, typically expressed in RDF, which add information about these documents. Annotations are often produced independently of the documents. We present AnnoVIP, a peer-to-peer platform capable of efficiently exploiting a corpus of annotated documents, relying on a new model of materialized XML views deployed in a peer-to-peer setting.

    Materials Cloud, a platform for open computational science

    Materials Cloud is a platform designed to enable open and seamless sharing of resources for computational science, driven by applications in materials modelling. It hosts 1) archival and dissemination services for raw and curated data, together with their provenance graph, 2) modelling services and virtual machines, 3) tools for data analytics and pre-/post-processing, and 4) educational materials. Data is citable and archived persistently, providing a comprehensive embodiment of the FAIR principles that extends to computational workflows. Materials Cloud leverages the AiiDA framework to record the provenance of entire simulation pipelines (calculations performed, codes used, data generated) in the form of graphs that make it possible to retrace and reproduce any computed result. When an AiiDA database is shared on Materials Cloud, peers can browse the interconnected record of simulations, download individual files or the full database, and start their research from the results of the original authors. The infrastructure is agnostic to the specific simulation codes used and can support diverse applications in computational science that transcend its initial materials domain.
    Comment: 19 pages, 8 figures
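    Retracing a computed result through a provenance graph amounts to collecting every ancestor of the result node. The sketch below shows this on a hand-written dictionary; it does not use the AiiDA API, and the node names and the `retrace` helper are hypothetical.

```python
# Hypothetical provenance graph: each node maps to the nodes it was derived from.
provenance = {
    "band_structure": ["calc_bands"],
    "calc_bands": ["relaxed_structure", "code_v1"],
    "relaxed_structure": ["calc_relax"],
    "calc_relax": ["input_structure", "code_v1"],
    "input_structure": [],
    "code_v1": [],
}


def retrace(node, graph):
    """Return every ancestor of `node`, in depth-first discovery order."""
    seen = []
    stack = list(graph[node])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(graph[current])
    return seen


history = retrace("band_structure", provenance)
```

    The resulting `history` lists the calculations, codes, and data that the final result depends on, with shared inputs such as `code_v1` reported once.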

    Efficient peer-to-peer data management

    The Internet has led to a fundamental increase in the information available to its users in recent years. Users want to express their needs by simple means, such as queries, and they want their queries to be evaluated without caring where the data is placed or how the queries are optimized. The work presented in this thesis contributes to the goal of declarative and efficient management of Web content in distributed settings, and it is divided into two main chapters. In the first chapter we study OptimAX, an optimizer for the Active XML language which is able to rewrite a given Active XML document into an equivalent document which would, very likely, have a smaller execution cost. With OptimAX we focus on the problem of distributed query optimization in the Active XML setting, and we present two case studies inspired by the R&D projects in which our group has been involved. In the second chapter, we approach the optimization problem from a different perspective: we optimize queries using a set of precomputed queries (materialized views). We have developed a peer-to-peer platform, called ViP2P (views in peer-to-peer), that lets users publish their XML documents and specify views over these documents using a tree pattern language. Whenever a user asks a query, the system tries to find views that can be combined into a rewriting equivalent to the asked query. We have carried out WAN experiments using computers from different laboratories in France, which show that the ViP2P platform scales up to several GB of data.